Course title |
資料科學之計算方法與工具 Computational Methods and Tools for Data Science |
Semester |
104-2 |
Offered to |
College of Science, Institute of Applied Mathematical Sciences |
Instructor |
王偉仲 |
Course number |
MATH5024 |
Course identifier |
221 U6820 |
Section |
|
Credits |
3 |
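Full/half year |
Half year |
Required/elective |
Elective |
Class time |
Tuesday, periods 7, 8, 9 (14:20~17:20) |
Location |
天數101 (Astronomy-Mathematics Building, Room 101) |
Remarks |
Co-taught with 陳君厚. Enrollment limit: 80 students |
Ceiba course website |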
http://ceiba.ntu.edu.tw/1042MATH5024_DS |
Course introduction video |
|
Core competency links |
Core competency links have not yet been established for this course |
Course syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course overview |
Companion course website: https://sites.google.com/a/math.ntu.edu.tw/t-2016s-data/
This course will cover the following topics.
(I) Singular value decomposition (SVD) and principal component analysis (PCA)
- Fundamentals and computations of SVD
- Fundamentals and computations of PCA
- Proper orthogonal modes and robust PCA
- Random sampling and random projections
- Randomized algorithms for low-rank matrix approximation
- Applications to an oscillating mass and to dimensionality reduction
- Hands-on experiments and implementations in MATLAB
(II) Extended SVD and PCA
- Sparse SVD and sparse PCA
- Nonnegative SVD and nonnegative PCA
- Tensor decomposition, higher-order SVD, multilinear PCA, and nonnegative tensor decomposition
- Multilinear algebra
- Application to cryo-electron microscopy images
- Hands-on experiments and implementations in MATLAB
(III) Independent Component Analysis (ICA)
- Introduction to ICA
- Principal component analysis (whitening) as a preprocessing step
- ICA via optimization approaches
- ICA by maximization of nongaussianity
- ICA by maximum likelihood estimation
- ICA by minimization of mutual information
- Applications in blind source separation for audio signals and images
- Hands-on experiments and implementations in MATLAB
(IV) Image Processing and Analysis
- Basic concepts of images
- Linear filtering for image denoising
- Diffusion and image processing
- Image recognition via machine learning
- SVD and linear discriminant analysis
- Application to real images
- Hands-on experiments and implementations in MATLAB
(V) Visualization and Exploratory Data Analysis (EDA)
- Interactive graphics and EDA
- General introduction to statistical graphics and matrix visualization
- Visualization for continuous and binary data
- Visualization for categorical data
- Visualization for data with cartography links
- Covariate-adjusted data visualization and visualization for other types of data and modeling
- Visualization for symbolic data and for big data |
Course objectives |
This course prepares students to solve contemporary data science problems numerically. With a focus on recent data science applications, it aims to give successful students a solid background and proficiency in modern yet fundamental computational methods and tools, so that they can translate theoretical concepts into working computer programs. |
Course requirements |
Programming (mainly MATLAB; C is helpful for projects), calculus, linear algebra, and statistics |
Expected weekly study hours outside class |
|
Office Hours |
By appointment |
Required reading |
(1) Data-Driven Modeling & Scientific Computation: Methods for Complex Systems and Big Data (2013), J. Nathan Kutz (lecture notes: http://goo.gl/otL0qc)
(2) Handbook of Data Visualization (2008), Chun-houh Chen, Wolfgang Karl Härdle, and Antony Unwin (Eds.), Springer Handbooks of Computational Statistics. Chapters can be downloaded from: http://www.springer.com/us/book/9783540330363 |
References |
- Randomized Algorithms for Matrices and Data (2010), Michael W. Mahoney
- Applied Numerical Linear Algebra (1997), James W. Demmel
- Numerical Linear Algebra (1997), Lloyd N. Trefethen and David Bau III
- An Introduction to Parallel Programming (2011), Peter Pacheco
- Parallel Computing for Data Science: With Examples in R, C++ and CUDA (2015), Norman Matloff
- Independent Component Analysis (2001), Aapo Hyvärinen, Juha Karhunen, Erkki Oja
- Exploratory Data Analysis (1977), John W. Tukey
- Tensor Decompositions and Applications (2009), Tamara G. Kolda and Brett W. Bader
- Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation (2009), Andrzej Cichocki, Rafal Zdunek, Anh Huy Phan, Shun-ichi Amari |
Grading (for reference only) |
No. | Item | Percentage | Description |
1. | Homework | 30% | |
2. | Team Project | 40% | |
3. | Midterm | 30% | |